Truth Discovery to Resolve Object Conflicts in Linked Data
Abstract
In the Linked Data community, anyone can publish data as Linked Data on the web because of the openness of the Semantic Web. As a result, RDF (Resource Description Framework) triples describing the same real-world entity can be obtained from multiple sources, which inevitably results in conflicting objects for a given predicate of that entity. The objective of this study is to identify one truth from multiple conflicting objects for a given predicate of a real-world entity. An intuitive, common-sense principle is that an object from a reliable source is trustworthy and, conversely, a source that provides trustworthy objects is reliable. Many truth discovery methods based on this principle have been proposed to estimate source reliability and identify the truth. However, the effectiveness of existing truth discovery methods is significantly affected by the number of objects provided by each source. Therefore, these methods cannot be trivially extended to resolve conflicts in Linked Data with a scale-free property, i.e., most sources provide few conflicting objects, whereas only a few sources provide many. To address this challenge, we propose a novel approach called TruthDiscover to identify the truth in Linked Data with a scale-free property. TruthDiscover adopts two strategies to reduce the effect of the scale-free property on truth discovery. First, it leverages the topological properties of the Source Belief Graph to estimate the prior beliefs of sources, which are used to smooth the trustworthiness of sources. Second, it uses a Hidden Markov Random Field to model the interdependencies between objects so that the trust values of objects can be estimated accurately. Experiments are conducted on four datasets, covering people, locations, organizations, and descriptors, to evaluate TruthDiscover. Experimental results show that TruthDiscover outperforms TruthFinder, F-Quality Assessment, and Voting in terms of accuracy on data with a scale-free property.
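The mutual dependence between source reliability and object trust described in the abstract can be sketched as a simple fixed-point iteration, next to a plain voting baseline. This is only a minimal illustration under stated assumptions, not TruthDiscover or TruthFinder: the function names `vote` and `iterative_truth`, the uniform initial reliability of 0.5, and the averaging update rule are all hypothetical choices made for this example. It does not include the Source Belief Graph priors or the Hidden Markov Random Field.

```python
from collections import defaultdict


def vote(claims):
    """Voting baseline: return the object asserted by the most sources.

    claims is a list of (source, object) pairs for one entity/predicate.
    """
    counts = defaultdict(int)
    for _source, obj in claims:
        counts[obj] += 1
    return max(counts, key=counts.get)


def iterative_truth(claims_by_item, iterations=20):
    """Alternate between object trust and source reliability.

    claims_by_item maps an (entity, predicate) key to a list of
    (source, object) pairs. The update rules are illustrative assumptions.
    """
    sources = {s for claims in claims_by_item.values() for s, _ in claims}
    reliability = {s: 0.5 for s in sources}  # assumed uniform start
    trust = {}
    for _ in range(iterations):
        # Object trust: normalized sum of the reliabilities of its sources.
        trust = {}
        for item, claims in claims_by_item.items():
            scores = defaultdict(float)
            for s, obj in claims:
                scores[obj] += reliability[s]
            total = sum(scores.values())
            trust[item] = {obj: sc / total for obj, sc in scores.items()}
        # Source reliability: mean trust of the objects the source asserts.
        sums = defaultdict(float)
        counts = defaultdict(int)
        for item, claims in claims_by_item.items():
            for s, obj in claims:
                sums[s] += trust[item][obj]
                counts[s] += 1
        reliability = {s: sums[s] / counts[s] for s in sources}
    truths = {item: max(t, key=t.get) for item, t in trust.items()}
    return truths, reliability


if __name__ == "__main__":
    example = {
        "e1": [("A", "x"), ("B", "x"), ("C", "y")],
        "e2": [("A", "p"), ("B", "p"), ("C", "q")],
    }
    print(iterative_truth(example))
```

In this sketch, a source that contributes a single claim has its reliability determined by one object only, which hints at why estimates of this kind become unstable when most sources provide few objects.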
Similar resources
Truth Discovery with Memory Network
Truth discovery aims to resolve conflicts and find the truth among statements from multiple sources. Conventional methods are mostly based on the mutual effect between the reliability of sources and the credibility of statements, but pay no attention to the mutual effect among the credibility of statements about the same object. We propose memory network based models to incorporate these two i...
Truth Discovery and Crowdsourcing Aggregation: A Unified Perspective
In the era of Big Data, data entries, even describing the same objects or events, can come from a variety of sources, where a data source can be a web page, a database or a person. Consequently, conflicts among sources become inevitable. To resolve the conflicts and achieve high quality data, truth discovery and crowdsourcing aggregation have been studied intensively. However, although these tw...
TruthDiscover: Resolving Object Conflicts on Massive Linked Data
Considerable effort has been made to increase the scale of Linked Data. However, because of the openness of the Semantic Web and the ease of extracting Linked Data from semi-structured sources (e.g., Wikipedia) and unstructured sources, many Linked Data sources often provide conflicting objects for a certain predicate of a real-world entity. Existing methods cannot be trivially extended to reso...
Pay-as-you-go Feedback in Data Quality Systems
In many domains such as the web, sensor networks and social media, sources often provide conflicting information. It is of utmost importance to resolve conflicts and identify correct information. A number of approaches, referred to as truth finders, have been proposed recently. They address the problem of truth discovery using different principles such as link analysis, Bayesian modeling and re...
Truth Discovery Algorithms: An Experimental Evaluation
A fundamental problem in data fusion is to determine the veracity of multi-source data in order to resolve conflicts. While previous work in truth discovery has proved to be useful in practice for specific settings, sources’ behavior or data set characteristics, there has been limited systematic comparison of the competing methods in terms of efficiency, usability, and repeatability. We remedy ...
Journal title:
- CoRR
Volume abs/1509.00104
Publication date 2015